fix(bedrock): cache trailing message for stable prefix across agent turns #8916

Open
carl-auctane wants to merge 1 commit into aaif-goose:main from carl-auctane:fix/bedrock-cache-trailing-breakpoint
@carl-auctane
Summary

When BEDROCK_ENABLE_CACHING=true, BedrockProvider::converse() currently places cache points on the first three visible messages. This doesn't match how Anthropic and Bedrock actually look up cached prefixes: cache entries are keyed by the hash of the prefix ending at the breakpoint, and reads walk backward up to 20 blocks looking for prior writes. With the breakpoints fixed to early messages, every turn reprocesses everything appended after position 3, so in an agent loop the uncached work per turn grows linearly with turn count.
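To make the lookup model concrete, here is a toy simulation of it. This is a sketch under stated assumptions, not the providers' real behaviour: every block costs one unit, the conversation is append-only (so a prefix hash is identified by its length), and the three early cache points are collapsed into a single breakpoint at position 3. The names `Placement`, `LOOKBACK` and `uncached_blocks` are hypothetical and do not exist in goose.

```rust
use std::collections::HashSet;

/// Where the message-level cache point goes in this toy model.
#[derive(Clone, Copy)]
enum Placement {
    /// Breakpoint fixed after the first `n` blocks (stand-in for "first three").
    Fixed(usize),
    /// Breakpoint on the last block of each request.
    Trailing,
}

/// Assumed lookback window, taken from the "up to 20 blocks" figure above.
const LOOKBACK: usize = 20;

/// For each turn, returns how many blocks are processed without a cache read.
fn uncached_blocks(placement: Placement, turns: usize, blocks_per_turn: usize) -> Vec<usize> {
    // Prefix lengths that earlier requests wrote to the cache.
    let mut written: HashSet<usize> = HashSet::new();
    let mut out = Vec::with_capacity(turns);

    for turn in 1..=turns {
        let total = turn * blocks_per_turn;
        let breakpoint = match placement {
            Placement::Fixed(n) => n.min(total),
            Placement::Trailing => total,
        };

        // Walk backward from the breakpoint, checking at most LOOKBACK positions.
        let lowest = breakpoint.saturating_sub(LOOKBACK - 1).max(1);
        let hit = (lowest..=breakpoint)
            .rev()
            .find(|len| written.contains(len))
            .unwrap_or(0);

        // This request writes a cache entry for the prefix ending at its breakpoint.
        written.insert(breakpoint);

        // Everything after the longest cached prefix is processed fresh.
        out.push(total - hit);
    }
    out
}
```

With two blocks added per turn, `uncached_blocks(Placement::Fixed(3), 4, 2)` yields `[2, 2, 3, 5]` in this model, while `uncached_blocks(Placement::Trailing, 4, 2)` stays flat at `[2, 2, 2, 2]`. The model also shows the lookback limit: a trailing breakpoint with 25 blocks per turn never finds the previous write.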

This change places the cache point on the trailing message instead. On each new turn the lookback finds the breakpoint the previous turn wrote, so fresh processing is bounded to the content added since the last request. This matches the pattern Anthropic's prompt caching documentation recommends for growing conversations:

Place cache_control on the last block whose prefix is identical across the requests you want to share a cache. In a growing conversation the final block works as long as each turn adds fewer than 20 blocks.
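The placement change itself can be sketched as two selection functions over the visible messages. This is an illustration only: `first_three_cache_points` and `trailing_cache_point` are hypothetical helpers, not functions in goose, and the real code attaches the cache point while converting each message to its Bedrock form.

```rust
/// Old strategy (sketch): flag the first three visible messages.
/// Index `i` in the result says whether message `i` carries a cache point.
fn first_three_cache_points(visible_count: usize) -> Vec<bool> {
    (0..visible_count).map(|i| i < 3).collect()
}

/// New strategy (sketch): flag only the trailing visible message.
fn trailing_cache_point(visible_count: usize) -> Vec<bool> {
    (0..visible_count).map(|i| i + 1 == visible_count).collect()
}
```

For a five-message conversation the old selection flags indices 0 to 2 and the new one flags only index 4, so the flagged position advances with the conversation instead of staying near its start.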

The system-prompt cache point is unchanged. The misleading comment about "caching recent messages would shift positions each turn" is replaced with an accurate description of the lookup model.

For a worked cost comparison over a 10-turn agent loop and links to the relevant Anthropic and Bedrock docs, see the linked issue.

Testing

  • cargo fmt --all
  • cargo check -p goose
  • cargo clippy -p goose --all-targets -- -D warnings (no warnings)
  • cargo test -p goose --lib providers::bedrock (4 passed)
  • cargo test -p goose --lib providers::formats::bedrock (11 passed)

No test changes were needed. The existing per-message helper tests in providers::formats::bedrock exercise to_bedrock_message_with_caching directly with enable_caching=true and are unaffected. The test_caching_* tests in providers::bedrock assert on should_enable_caching() returning the right boolean and are also unaffected.

Related Issues

Relates to #8915

…urns

The BEDROCK_ENABLE_CACHING=true path currently places a cache point on
the first three visible messages. This does not match how Anthropic and
Bedrock look up cached prefixes.

Cache entries are keyed by the hash of the prefix ending at the
breakpoint, and reads walk backward up to 20 blocks looking for prior
writes. With the first-3 strategy, each new turn's breakpoint sits at
a fixed position early in the conversation, so everything appended
after it is reprocessed fresh on every turn. In an agentic tool-use
loop this grows linearly with turn count.

Place the cache point on the trailing message instead. On each new
turn the lookback finds the breakpoint the previous turn wrote, so
fresh processing is bounded to the content added since the last
request. This matches the pattern Anthropic's prompt caching
documentation recommends for growing conversations.

See https://platform.claude.com/docs/en/build-with-claude/prompt-caching
and https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html

Signed-off-by: Carl Youngblood <carl.youngblood@auctane.com>
Bojun-Vvibe added a commit to Bojun-Vvibe/oss-contributions that referenced this pull request Apr 29, 2026
- aaif-goose/goose#8916 fix(bedrock): cache trailing message for stable prefix across agent turns (merge-as-is)
- aaif-goose/goose#8904 fix(oidc-proxy): validate exp independently of MAX_TOKEN_AGE_SECONDS (merge-as-is — security fix with test inversion in same commit)
Bojun-Vvibe added a commit to Bojun-Vvibe/oss-contributions that referenced this pull request Apr 30, 2026
Gemini-cli MessageBus.request() fail-fast on publish failure (fixes #22588 60s silent hang) and goose Bedrock prompt-cache placement fix (move cache_control from first-three to trailing message to align with prefix-keyed 20-block lookback). INDEX appended with drip-196 verdict-mix and PR table.